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Abstract 

Background: Design of newly engineered microbial strains for biotechnological purposes would greatly benefit 
from the development of realistic mathematical models for the processes to be optimized. Such models can then 
be analyzed and, with the development and application of appropriate optimization techniques, one could identify 
the modifications that need to be made to the organism in order to achieve the desired biotechnological goal. As 
appropriate models to perform such an analysis are necessarily non-linear and typically non-convex, finding their 
global optimum is a challenging task. Canonical modeling techniques, such as Generalized Mass Action (GMA) 
models based on the power-law formalism, offer a possible solution to this problem because they have a 
mathematical structure that enables the development of specific algorithms for global optimization. 

Results: Based on the GMA canonical representation, we have developed in previous works a highly efficient 
optimization algorithm and a set of related strategies for understanding the evolution of adaptive responses in 
cellular metabolism. Here, we explore the possibility of recasting kinetic non-linear models into an equivalent GMA 
model, so that global optimization on the recast GMA model can be performed. With this technique, optimization 
is greatly facilitated and the results are transposable to the original non-linear problem. This procedure is 
straightforward for a particular class of non-linear models known as Saturable and Cooperative (SC) models that 
extend the power-law formalism to deal with saturation and cooperativity. 

Conclusions: Our results show that recasting non-linear kinetic models into GMA models is indeed an appropriate 
strategy that helps overcoming some of the numerical difficulties that arise during the global optimization task. 



1 Background 

Identifying optimization strategies for increasing strain 
productivity should be possible by applying optimization 
methods to detailed kinetic models of the target meta- 
bolism. Thus, a rational approach would pinpoint the 
changes to be done - e.g. by modulating gene expression 
- in order to achieve the desired biotechnological goals 
[1-4]. To build such models we can either start from a 
detailed description of the underlying processes (bot- 
tom-up strategy) or we can fit kinetic models to experi- 
mental data obtained in vivo (top-down strategy). 
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The bottom-up approach was the original strategy for 
model building in the biological sciences. Bottom-up 
kinetic models require information that is seldom avail- 
able, despite the increasing amount of kinetic data con- 
tained in a growing set of databases (for example see 
[5,6] and the webpage http://kinetics.nist.gov/kinetics/ 
index.jsp). Even in the best case scenarios where kinetic 
data are available, the data have often been obtained in 
different labs and under in vitro conditions that are 
hardly ever comparable or representative of the situation 
in vivo. In addition, models built using this strategy 
often fail to adequately reproduce the known behavior 
of the target system [7-10]. With the accumulation of 
time-series data, which were originally generated from 
the accurate measurement of transient responses, top- 
down modeling became viable as an alternative to the 
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bottom-up strategy [11]. However, top-down modeling 
also faces important difficulties. For example, regulatory 
interactions between metabolites and enzymes are very 
poorly characterized and most metabolic maps lack such 
crucial information. Therefore, for a given network 
structure (i.e. a stoichiometric description) obtained 
from databases, a large number of alternative regulatory 
patterns may exist that account for the observed experi- 
mental data [12]. Model discrimination among the alter- 
native regulatory patterns requires appropriate 
experimental design. However, this is seldom considered 
when performing the time series measurements. Last, 
but not the least, parameter identifiability in highly non- 
linear models can be problematic (for a review see [13]). 

An additional issue that is common to models built 
using both strategies is that such detailed kinetic models 
include non-convexities that lead to the existence of 
multiple local optima in which standard non-linear opti- 
mization algorithms may get trapped during the search. 
Several stochastic and deterministic global optimization 
methods have been proposed to overcome this limita- 
tion [14]. Deterministic methods, which are the only 
ones that can rigorously guarantee global optimality, 
rely on the use of convex envelopes or underestimators 
to formulate lower-bounding convex problems that are 
typically combined with spatial branch and bound stra- 
tegies. Most of these methods are general purpose and 
assume special structures in the continuous terms of the 
mathematical model. Because of this, they can encoun- 
ter numerical difficulties in specific metabolic engineer- 
ing systems that require the optimization of kinetic 
models with a large number of non-convexities of differ- 
ent nature. 

Given all these issues, it is hardly surprising that linear 
stoichiometric models have emerged as the most popu- 
lar tool to analyze genome-wide metabolic networks 
using optimization techniques. Linear optimization pro- 
blems can be solved using very fast and efficient algo- 
rithms [15,16] that are implemented in almost every 
kind of computer, ranging from laptops to cloud com- 
puting centers. In addition, such models require a rela- 
tively small amount of information. 

The possibility of condensing information about a very 
large network in a compact form enabled stoichiometric 
models to provide interesting insights in many different 
cases. However, the apparent simplicity in building and 
analyzing stoichiometric models comes at the cost of 
neglecting regulatory signals, metabolite levels and 
dynamic constraints. Accounting for these features in a 
dynamic way requires using more detailed, non-linear, 
mathematical models [17,18]. 

These models go a step further than stoichiometric 
models by incorporating regulatory influences through a 
set of ordinary differential equations that can account 



for the system's dynamics. Building such models is often 
impossible because the appropriate functional form that 
needs to be used to describe the dynamical behavior of 
specific processes is in general unknown. Modeling stra- 
tegies based on systematic approximated kinetic repre- 
sentations, such as power-laws [19-22], Saturating and 
Cooperative [23], or convenience kinetics [24], side-step 
this issue by providing uniform forms that are guaran- 
teed to be accurate over a range of conditions and 
reduce the amount of information required to build the 
models. Because of the regularity in the form of the 
mathematical function, models based on approximate 
formalisms can be automatically built from the reaction 
scheme of the target system. The model parameters can 
subsequently be estimated from experimental data using 
different procedures [13,25]. 

Although building and analyzing of comprehensive 
genome-wide detailed models is still not viable in most 
cases (see however [26,27]), developing ways to extend 
large scale optimization analysis to larger and more rea- 
listic non-linear kinetic models is an important part of 
the future of systems biology [18]. In fact, the optimiza- 
tion of certain types of non-linear problems can already 
be solved very efficiently and geometric programming 
problems with up to 1,000 variables and 10,000 con- 
straints can be solved in minutes on a personal 
computer. 

Efficient global optimization techniques are available 
for power-law models [1,28-30], either in S-system form 
or in Generalized Mass Action (GMA) form (for a 
review see [31]). In the case of S-system models, a sim- 
ple logarithmic transformation brings the model to a 
linear form [1]. In the case of GMA models, the pro- 
blem can be efficiently solved using branch-and-bound 
[28,32] and outer-approximation techniques [29,30]. 

The usefulness of the global optimization techniques 
developed for GMA models has been shown in the ana- 
lysis of the adaptive response of yeast to heat shock 
[29,33]. In essence, starting with a GMA model and 
considering a set of constraints on flux and metabolite 
values, we can obtain: (i) The pattern of enzyme activ- 
ities that maximizes a given objective, (ii) The region of 
feasible changes in enzyme activities so that the model 
fulfills a set of constraints on fluxes, metabolites, maxi- 
mum allowable change in activity, etc., and (iii) A heat 
map of how the objective function changes within the 
feasible region. These results share some similarities 
with those produced with stoichiometric models, but 
incorporate many additional features. 

Based on ideas similar to those that led to the devel- 
opment of the power-law formalism, Sorribas et al. [23] 
proposed a new Saturable and Cooperative (SC) formal- 
ism, that extends the power-law representation to 
include cooperativity and saturation. Although models 
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built using this new formalism loses some of the simpli- 
city inherent to the analysis of S-systems and GMA 
models, they tend to be accurate over a wider range of 
conditions than both the S-System and GMA represen- 
tations [23]. Thus, it is important to enlarge the scope 
of global optimization methods developed for power-law 
models in order to deal with the SC formalism and ana- 
lyze under which situations the later models behave bet- 
ter than the former. 

Optimization of SC models faces a number of practi- 
cal problems common to kinetic non-linear models 
[34,35]. To sidestep these problems, and in order to be 
able to use global optimization methods developed for 
power-law models, we will use a technique called recast- 
ing. Recasting permits the exact transformation of a 
continuous non-linear model with an arbitrary form 
into a canonical GMA model [36,37]. This transforma- 
tion is typically performed by increasing the number of 
variables of the original model. Through this technique, 
arbitrary non-linear models can be represented using a 
canonical form such as GMA or S-system that can be 
used for simulation and optimization purposes, which 
opens the door for effectively extending the optimization 
and feasibility analysis originally devised for GMA mod- 
els to other detailed kinetic models. 

In this paper, and as a first step to define a framework 
for optimization of non-linear models with arbitrary 
form and extend FBA and related approaches to detailed 
kinetic models, we shall show the practical utility of 
recasting SC models into GMA models for optimization 
purposes. This technique is similar to the symbolic 
reformulation algorithm proposed by Smith and Pante- 
lides [38]. Our method, however, focuses on obtaining a 
power-law representation that greatly facilitates global 
optimization, instead of continuing with the recasting 
until converting the model to a standard form contain- 
ing linear constraints and a set of nonlinearities corre- 
sponding to bilinear product, linear fractional, simple 
exponentiation and univariate function terms. After 
recasting the model to the canonical form, we can apply 
any of the optimization strategies we have presented for 
GMA models [29,32] to obtain the global optimum of 
the original SC problem. 




Figure 1 Branched network with feedback and feedforward 
regulation. X5 is a fixed external variable that can be varied at will. 
A GMA reference model is set-up by selecting appropriate 
parameters (see text). 



point. X 3 acts as a feed-back inhibitor of the synthesis of 
X 2 , while Xi is an activator of the synthesis of X 4 . 
The generic model for this system is: 



(i) 



Each of the velocities is a non-linear function of the 
involved metabolites. The SC representation, provides a 
systematic way for defining a functional model of this 
pathway. As a demonstrative example, let us suppose 
that the numerical model is: 



Xi 


= v\ 


-v 2 


x 2 


= v 2 


-V3-V5 


x 3 


= vs 


- V\ 


x 4 


= V5 


-v 6 



dX 1 _ 20kiXl 
dt 



40k 2 X{ 



(x^i)xl(i^) 



dX 2 
dt 



40k 2 X\ 



7.5fc 3 Xp 
X\ - 5 + 0.25 



\6k 5 X\X\ 



(2) 



(*i + 1) {A + 1) 

dX 3 7.5k 3 X 2 2 5 12k 4 Xl 



dt Xp + 0.25 X\ + 1 

dX 4 



\6k s X\X\ 



&k&X\ 

dt (X\ + l) (X] + l) X\ + 1 



2 Results 

2.1 Global optimization of non-linear models through 
recasting 

For a proof of concept of the difficulties of global opti- 
mizing non-linear models and of the use of recasting for 
attaining practical solutions, we shall start by defining a 
reference biochemical network that corresponds to the 
reaction scheme in Figure 1. This hypothetical system 
has a source metabolite X 5 and four internal metabo- 
lites. The network includes six reactions and a branch 



In these equations, /c r , r = 1,.., 6 is an auxiliary variable 
used to model changes in the enzyme activity. At the 
basal level, h r - 1 for all the reactions. During the opti- 
mization tasks, it is possible to limit the maximum 
change in gene expression by imposing a maximum 
allowable change in k r . 
We shall now address the following questions: 
(i) To what extent can general purpose global optimi- 
zation methods be applied to SC models?, (ii) Given 
that a SC model can be recast as a GMA (rGMA), is 
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this useful for optimization of the original SC model?, 
(iii) Are the results obtained with the rGMA equivalent 
to the results of the original SC model?, and (iv) What 
are the practical advantages of optimizing a rGMA 
model?. 

2.2 Optimization goals 

In order to address the questions posed at the end of 
the previous section we shall define the following opti- 
mizations tasks (note that changes in enzyme activities 
and metabolite concentrations are constrained between 
0.2 < k r < 5.0 and 0.1 < X t < 10.0 respectively in all the 
instances unless otherwise specified): 



Table 1 Results for the maximization of X 3 and v 4 and 
optimization goals 01-04 using BARON v.8.1.5. for a 
tolerance of 0.2% 



0 


*i 


k 2 


*3 


*4 


ks 


k 6 


x 3 


OG (%) 


CPU (s) 


1 


0.26 


5.00 


4.97 


0.20 


0.20 


0.54 


8.30 


0.20 


136.17 


2 


0.20 


0.24 


0.22 


0.20 


0.21 


0.20 


1.10 


0.00 


0.06 


3 


0.60 


5.00 


5.00 


0.53 


0.20 


0.27 


5.39 


0.20 


96.39 


4 


0.99 


1.15 


1.00 


0.96 


1.00 


1.00 


1.10 


0.00 


1.42 


O 


*i 


k 2 


*3 


*4 


k 5 


ke 


v 4 


OG (%) 


CPU (s) 


1 


4.61 


5.00 


5.00 


5.00 


0.72 


1.20 


37.40 


0.20 


157.83 


2 


3.22 


3.73 


5.00 


4.99 


0.21 


0.22 


31.33 


0.00 


1.67 


3 


0.88 


0.94 


0.88 


0.96 


0.23 


3.00 


6.60 


0.00 


10.53 


4 


1.16 


1.00 


1.34 


1.34 


1.00 


1.00 


7.61 


0.00 


3.61 



♦ Ol: What is the optimal pattern of changes in 
enzyme activities that maximizes the objective func- 
tion in the new steady-state for a fixed value of X 5 ? 

♦ 02: What is the optimal pattern of changes in 
enzyme activities that maximizes the objective func- 
tion in the new steady-state for a fixed value of X 5 
considering a maximum allowable variation of 10% 
in the steady-state values of the intermediaries? 

♦ 03: What is the optimal pattern of changes in 
enzyme activities that maximizes the objective func- 
tion in the new steady-state for a fixed value of X 5 
considering changes in the output flux from X 4 of 
less than 10% with respect to its reference value? 

♦ 04: What is the best set of changes, assuming that 
we can only manipulate three enzymes, that maxi- 
mizes the objective function in the new steady-state 
for a fixed value of X 5 considering a maximum varia- 
tion of 10% in the steady-state values of the 
intermediaries? 

Two different objective functions (OF), steady-state 
concentration of X 3 and flux v 4 , have been considered 
for each optimization case, except for 03. This latter 
case has been optimized in terms only of the first objec- 
tive (i.e., steady-state concentration of X 3 ), because lim- 
its on v 4 are already included in the formulation of the 
optimization problem. 

2.3 Global optimization of SC models using BARON 

We first address the optimization of the aforementioned 
model in their original SC form using state of the art 
global optimization techniques. The model was coded in 
the algebraic modeling system GAMS 23.0.2 and solved 
with the commercial global optimization package 
BARON v.8.1.5. on an Intel 1.2 GHz machine. An 
optimality gap (i.e., tolerance) of 0.2% was set in all the 
instances. As can be seen in Table 1, BARON produce 
results with an optimality gap (OG) below the specified 
tolerance. 



Table 1 only shows one solution for each particular 
instance. However, BARON identified in each case a set 
of equivalent optima (i.e, solutions with the same objec- 
tive function value) involving different changes in 
enzyme activities, which indicates that the optimization 
problem is somehow degenerated. This redundancy is a 
consequence of the system's structure and has practical 
implications. As an example, we have calculated some of 
these equivalent points for case 01-v 4 using the NumSol 
option of BARON (see Figure 2). In particular, a well 
defined triangular region containing the changes in k 2 
and /c 5 , and k\ and k 2 that lead to the same objective 
function value is identified. Within these regions, one 
can decide which combination of changes should be 
selected based on additional cost arguments, as they all 
show the same performance in terms of the predefined 
objective function. This region could be further reduced 
by imposing additional constraints to the optimization. 

2.4 Recasting SC models into GMA models 

Any SC model can be recast into a GMA canonical 
model by introducing the auxiliary variables 
z r j = K r j + X^ r \ Substitution and differentiation generates 
the following recast GMA (rGMA) model: 

p n+m 

Xi = £ VirVr 11 i = h ... n (3a) 

r=l j=l 

z rj = n r jXj r} 1 X j r=l,..,p ^ 3b j 

j = 1, n + m 

with appropriate initial conditions Xj(0) = Xj 0 and 

^•(0) = ^+^;. 

For simulation purposes, model (3) is equivalent to the 
original SC model. As discussed in [36], a model recast 
into a GMA model has the same steady-state that the 
original non-linear model. The steady-state equations of 
the rGMA model can be expressed as: 
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Figure 2 Equivalent optimal solutions for the case S1-01-v4. Blue points indicates results on the original SC model obtained with BARON. 
Red points identify solutions obtained for the corresponding rGMA and OA method (see text for details). 



EWrV r n*TV =0 i=l,..,n 

r=l j=l 



nrjXf 3 Xj = 0 r=l,..,p 
j = 1, n + m 



mm - OF 
(4a) s.t. 

K rj+ X^=z rjo 1 

j 

(4k) Specific constraints for each optimization task 

(Ol, 03 only) 

0.1<X<10 i 



OF = {X 3l v 4 } 



l,..,n 
- l,.,p 
1, n + m 



l,...,n 



2.5 Steady-state optimization of SC models through 
recasting 

The steady-state solutions of Eqn. (4b) satisfy also Eqn. 
(4a). Thus, for optimization purposes, the steady-state 
constraints of interest are: 



£/X ir V r n V Z r7 =° i = 1, .., Tl 

r=l j=l 



K rj+ X% =z rjo r=\,..,p 
j = 1, n + m 



(5a) 



(5b) 



According to these results, the optimization problem 
can be stated as: 



min — OF 



Kr i + X jo ~ Zr i° 



Xn < Xi < Xiu 
k rL <k r < k rU 
additional constraints 



OF = {Xi OTV r ] 

,., n 

..,n + m 
..,p 

.., n + m 
i = 1, n 
r = " 



(6) 



In our reference model, we shall consider the follow- 
ing constraints: 



(Ol, 02, 03 only) 

0.2 < k r < 5 

(02, 04 only) 

0 9X bas < x . < i AX f AS 

(03 only) and (OF : X 3 only) 



r=l, ...,p 



i = 1, n 



0.9vf' 



< y 4 < 



l.lyf 



(04 only) 

k r = k r i + k r 2 + k r 3 

tffyn <k n <(l-%rl 
(1 - 8)y r2 < k r2 < (1 +8)y r2 
(1 +6)y r3 <k r3 <k^ B Yr 3 

Yrl + Yr2 + Yr3 = 1 

ZiiYn+EiiYrt <ME = 3 



...,p 
..,P 
...,p 
...,P 
...,P 



(7) 



Once the problem has been recast into a rGMA, its 
mathematical structure can be exploited in order to 
improve the efficiency of the solution procedure, as 
demonstrated by the authors in previous works. This 
problem has a GMA form except for the auxiliary con- 
straint 5b, which is required to recast the SC into the 
rGMA. This constraint can be easily handled by means 
of relaxation techniques and exponential transforma- 
tions similar to those used by the authors in their global 
optimization algorithms for pure GMA models [32,33]. 
In particular, two algorithms were developed for the glo- 
bal optimization of GMA models: a customized outer- 
approximation (OA, [30]) and a tailored spatial branch- 
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and-bound (sBB, [32]). The authors showed that the 
numerical performance of these methods depends on 
the specific problem being solved, and that none of 
them is clearly better than the other one. Here, we use 
the OA algorithm to solve 6, as this method proved to 
be faster than sBB for problems of smaller size ([32]). 
Again, the main body of the algorithm was coded in 
GAMS 23.0.2, using CPLEX 11.2.1 as MILP solver for 
the master subproblems and CONOPT 3.14 s as NLP 
solver for the slave subproblems of the algorithm. For a 
fair comparison, we also set a tolerance of 0.2%, the 
same as when using BARON. 

As can be seen in Table 2, the optimization of the 
rGMA formulation using our customized OA yields 
similar results to those obtained when BARON is 
applied to the original SC model. In some cases, signifi- 
cant reductions in computational time are attained with 
our OA algorithm. While BARON took a total time of 
407.68 CPU seconds for solving the 8 instances, the cus- 
tomized OA algorithm solved the same problems in 8.5 
CPU seconds. 

Note that the objective function values obtained with 
the SC and rGMA models only differ within the toler- 
ance imposed. In some cases, discrepancies regarding 
the enzymatic profiles calculated are observed mainly 
due to the system's structure, that is, to the fact that the 
problem contains multiple solutions attaining the same 
performance in terms of objective function value but 
involving different enzymatic configurations, as dis- 
cussed in section 2.3. 

To further investigate this issue, we apply the multi- 
solution capability of BARON to the rGMA model (Fig- 
ure 2). Again, different equivalent optima are obtained, 
but this time the dispersion of the equivalent solutions 
generated for a given case tend to concentrate either in 
the center or in the extremes of the region containing 
the solutions with the same objective function value cal- 
culated with the SC model. 



Table 2 Results for the maximization of X 3 and v 4 using 
the rGMA model and optimization goals 01-04 using the 
customized OA for a tolerance of 0.2% 



0 




k 2 


ks 


k 4 


k s 


ke 


x 3 


OG (%) 


CPU (s) 


1 


0.26 


5.00 


5.00 


0.20 


0.20 


0.20 


8.30 


0.20 


2.94 


2 


0.21 


0.22 


0.21 


0.20 


0.20 


0.20 


1.10 


0.00 


0.06 


3 


0.60 


5.00 


5.00 


0.53 


0.20 


0.24 


5.40 


0.13 


2.35 


4 


1.00 


1.05 


0.97 


0.92 


1.00 


1.00 


1.10 


0.00 


0.23 


0 


*i 


k 2 


k 3 


k 4 


ks 


ke 


v 4 


OG (%) 


CPU (s) 


1 


3.96 


5.00 


5.00 


5.00 


0.20 


2.99 


37.47 


0.00 


0.16 


2 


3.22 


3.55 


5.00 


4.99 


0.20 


0.21 


31.33 


0.17 


0.66 


3 


0.68 


1.79 


1.12 


1.27 


0.20 


0.21 


6.60 


0.00 


0.12 


4 


1.16 


1.00 


1.34 


1.34 


1.00 


1.00 


7.61 


0.11 


1.98 



Table 3 Results (objective function) of the optimization 
of case 01- v 4 for specific regions of k 2 and k 5 obtained 
with BARON for the SC model 



k 5 /k 2 


1 


2 


3 


4 


5 


6 


7 


8 


8 


36.50 


36.71 


36.90 


37.08 


37.24 


37.37 


37.47 


37.47 


7 


36.62 


36.83 


37.02 


37.19 


37.34 


37.46 


37.47 


37.47 


6 


36.75 


36.95 


37.14 


37.31 


37.44 


37.47 


37.47 


37.47 


5 


36.88 


37.08 


37.26 


37.41 


37.47 


37.47 


37.47 


37.47 


4 


37.02 


37.21 


37.38 


37.47 


37.47 


37.47 


37.47 


37.47 


3 


37.15 


37.34 


37.47 


37.47 


37.47 


37.47 


37.47 


37.47 


2 


37.29 


37.46 


37.47 


37.47 


37.47 


37.47 


37.47 


37.47 


1 


37.43 


37.47 


37.47 


37.47 


37.47 


37.47 


37.47 


37.47 



Domain of each k r {4 < k 2 < 5;0.2 < k 5 < 0.8) has been split into 8 intervals 
with equal width. 



The region illustrated in Figure 2 should not be mis- 
understood as a feasibility region. In fact, solutions do 
exist outside this region, but they lead to worse objec- 
tive function values. To further clarify this issue, we 
consider a grid of values for k 2 and k 5 in the region 
defined by constraints 4 < k 2 < 5 and 0.2 < k 5 < 0.8, 
and solve the optimization problem within each cell 
applying BARON to the SC model, and our OA to the 
rGMA model. Recall that these linear constraints 
define a region that contains that in Figure 2. The 
results obtained in this optimization are illustrated in 
Tables 3 and 4, and are exactly equal for both meth- 
ods. However, the CPU time is much lower when 
using our OA algorithm applied to rGMA (11,811 
CPU seconds for generating all the points with 
BARON applied to the SC model vs 17 CPU seconds 
with the customized OA applied to the rGMA model; 
as shown in Tables 5 and 6). 

2.6 Difficult optimization tasks can be solved via 
recasting 

The reference model can be optimized either by general 
purpose techniques or by rGMA specific methods such 
as the customized OA. However, even with this simple 



Table 4 Results (objective function) of the optimization 
of case 01 -v 4 for specific regions of k 2 and k 5 obtained 
with the customized OA for the rGMA model 



k 5 -k 2 


1 


2 


3 


4 


5 


6 


7 


8 


8 


36.50 


36.71 


36.90 


37.08 


37.24 


37.37 


37.47 


37.47 


7 


36.62 


36.83 


37.02 


37.19 


37.34 


37.46 


37.47 


37.47 


6 


36.75 


36.95 


37.14 


37.31 


37.44 


37.47 


37.47 


37.47 


5 


36.88 


37.08 


37.26 


37.41 


37.47 


37.47 


37.47 


37.47 


4 


37.02 


37.21 


37.38 


37.47 


37.47 


37.47 


37.47 


37.47 


3 


37.15 


37.34 


37.47 


37.47 


37.47 


37.47 


37.47 


37.47 


2 


37.29 


37.46 


37.47 


37.47 


37.47 


37.47 


37.47 


37.47 


1 


37.43 


37.47 


37.47 


37.47 


37.47 


37.47 


37.47 


37.47 



Domain of each /c r (4 < k 2 < 5;0.2 < k 5 < 0.8) has been split into 8 intervals 
with equal width. 
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example, we may encounter instances that are hard to 
solve using standard techniques. Consider, for instance, 
the same reaction scheme as before but this time with the 
alternative parameters indicated in the following model: 



dXi ll.llfciX?' 86 



dt Xf 86 + 0.81 



12.35fe 2 Xj" 



dX 2 
dt 



(XP + 0.61)^« (o.ll + 5 p) 

l235k 2 X\ 5A 
(XJ- + 0.61)X|- (o.H + ^p) 
4.44fe 3 X|- 14 



x|- 14 



+ 0.11 

7.41fe 5 X?- 51 Xf 



(8) 



(X? 51 + 0.19) (Xf 51 + 0.11) 



dX 3 4A4k 3 Xj- 14 



4.44fe 4 X 414 



dt X 414 +0.11 X 414 + 0.11 
dX 4 



7.41fe 5 X? 51 Xf 



dt (X? 51 + 0.19) (Xf 51 + 0.11) 
6.67k e X\ 57 
~ X\ 57 + 1.40 



The optimization task of interest being: 

♦ 05: Which is the optimal pattern of changes in 
enzyme activities that maximize v 6 in the new 
steady-state for a fixed value of X 5 and considering 
the following constraints? 



0.3 < Xi < 30 
0.1 <X 2 < 10 
0.1 <X 3 < 10 
0.6 <X 4 < 50 
0.1 < k r < 20 r = 



(9) 



When BARON is employed to solve this case using 
the native SC form, it cannot reduce the optimality gap 

Table 5 Results (CPU time in seconds) of the optimization 
of case 01- v 4 for specific regions of k 2 and k 5 obtained 
with BARON for the SC model 

k 5 /k 2 1 2 3 4 5 6 7 8 

8 212.53 308.53 185.64 201.80 222.30 201.53 139.16 178.31 
7 194.81 161.16 215.80 196.81 344.73 243.02 0.03 174.81 
6 234.30 203.75 147.08 180.69 328.34 254.42 304.11 280.53 
5 212.08 282.41 329.33 237.34 208.02 292.27 200.00 154.62 
4 288.00 160.14 92.94 235.80 172.69 147.14 56.11 150.28 
3 125.56 111.17 150.27 187.52 337.97 158.16 112.66 264.12 
2 239.70 190.59 100.03 138.47 106.38 205.14 119.39 246.34 
1 140.42 102.12 80.45 21.69 73.12 96.61 89.94 80.03 

Domain of each /c r (4 < k 2 < 5;0.2 < k 5 < 0.8) has been split into 8 intervals 
with equal width. 



Table 6 Results (CPU time in seconds) of the optimization 
of case 01 -v A for specific regions of k 2 and k 5 obtained 
with the customized OA for the rGMA model 



k 5 /k 2 


1 


2 


3 


4 


5 


6 


7 


8 


8 


0.13 


0.27 


0.23 


0.18 


0.17 


0.19 


0.28 


0.28 


7 


0.26 


0.28 


0.28 


0.26 


0.28 


0.23 


0.32 


0.25 


6 


0.32 


0.30 


0.28 


0.28 


0.27 


0.23 


0.19 


0.25 


5 


0.31 


0.21 


0.25 


0.25 


0.26 


0.28 


0.27 


0.29 


4 


0.25 


0.27 


0.32 


0.30 


0.25 


0.27 


0.26 


0.28 


3 


0.20 


0.22 


0.28 


0.28 


0.29 


0.30 


0.19 


0.53 


2 


0.28 


0.25 


0.19 


0.19 


0.22 


0.17 


0.30 


0.25 


1 


0.23 


0.24 


0.26 


0.27 


0.23 


0.21 


0.24 


0.31 



Domain of each /c r (4 < k 2 < 5;0.2 < k 5 < 0.8) has been split into 8 intervals 
with equal width. 

below the specified tolerance after 1 hour of CPU time. 
In contrast, when the model is recast into its rGMA 
form and our OA method is applied, the global opti- 
mum can be determined with an optimality gap of 2% 
in 10.95 seconds (see Table 7). This illustrates both, the 
utility of using the rGMA as a canonical form for deal- 
ing with the optimization of SC models, and the compu- 
tational efficiency of our global optimization methods 
specifically designed to take advantage of the mathema- 
tical structure of the GMA. 

3 Discussion 

While experimental tools to manipulate gene expression 
are already available, there is no established set of guide- 
lines on how these tools can be used to achieve a cer- 
tain goal. So far, two main difficulties have prevented 
model driven optimization from becoming a standard in 
providing such guidelines: (i) the lack of information to 
build detailed kinetic models and (ii) the computational 
difficulties that arise upon the optimization of such 
models. The latter can be exemplified by the application 
of mixed integer non-linear optimization techniques 
(MINLP) in the context of kinetic models presented in 
[34,35]. In such cases, the optimization task showed to 
be computationally very demanding and global optimal- 
ity could not be guaranteed in many cases. We propose 
that using models with a standardized structure may 
offer a solution to both problems. On one hand, approx- 
imate kinetics, such as the SC formalism, can provide 
very accurate approximations and retain key features of 
the system like saturation and cooperativity. On the 
other hand, these formalisms can be automatically recast 



Table 7 Results of the optimization of model 8 with 
BARON (SC model) and the customized OA (rGMA model) 



Solver 




k 2 


k 3 


k 4 


k 5 


ke 


OF 


OG (%) 


CPU (s) 


BARON (SC) 


6.24 


5.16 


0.46 


0.6 


8.46 


9.09 


60.36 


45.18 


3600 


OA (rGMA) 


6.25 


5.17 


0.45 


0.6 


8.44 


9.1 


60.46 


2.18 


10.95 
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into GMA form and using efficient global optimization 
methods developed specifically for this canonical repre- 
sentation. Although this technique will certainly have 
limitations, our previous results indicate that it can be 
applied to models of moderate complexity without 
major problems [32], Optimization of GMA models 
comprising up to 60 reactions and 40 metabolites offer 
no limitation to our technique. We have shown how 
these methods can be easily used to optimize SC via 
recasting into rGMA models while still being quite 
efficient. 

Our results can be of particular interest for dealing 
with multicriteria optimization on realistic models. This 
kind of problems are relevant when exploring the adap- 
tive response to changing conditions, were conflictive 
goals may be at play [39,40]. Particularly, we should 
notice that several multi-objective optimization techni- 
ques, such as the weighted sum or epsilon constraint 
methods [41] are based on solving a set of auxiliary sin- 
gle-objective problems. These approaches could directly 
benefit from the numerical advances presented in this 
work. This kind of problems are relevant when explor- 
ing the adaptive response to changing conditions, were 
conflictive goals may be on play [39,40]. The highly effi- 
cient OA algorithm applied to rGMA models provide a 
practical way for extending multicriteria optimization 
methods, for instance as used in [39], to non-linear 
kinetic models. It is in principle possible to make use of 
methods such as ours to analyze the optimality of large 
scale dynamic systems much in the same way that Flux 
Balance Analysis can be applied to analyze the stoichio- 
metry of an organism on a genomic scale. To make this 
possible, however, extensive experimental and modeling 
efforts would be required to characterize the most 
important properties of the involved processes. In fact, 
we anticipate that practical limitations to apply the tech- 
niques presented here in solving larger problems will be 
dominated by the lack of information about the compo- 
nent processes and metabolites rather than by the tech- 
nical capacity of the optimization technique presented 
here. Although a complete kinetic characterization of 
the processes in a complete metabolic network may yet 
be far, information on kinetic orders and saturation 
fractions is easier to obtain. In this context, the SC 
formalism provides a sound approximation that results 
in a mathematical representation useful for simulation 
and optimization through recasting. 

4 Conclusions 

We expect that the possibility of building models using 
non-linear approximate formalisms and of subsequently 
optimizing these models will trigger interest in the 
experimental characterization of the components of cel- 
lular metabolism. After the genomic explosion, we need 



to step back and begin to measure enzyme activities, 
metabolite levels, and regulatory signals on a larger scale 
than we used to do before, if we want to understand the 
emergence of the dynamic properties of biological sys- 
tems and to be able to develop successful biotechnologi- 
cal applications. 

5 Methods 

5.1 Modelling strategies 

The process of model building and optimization can be 
used to understand how a system should be changed in 
order to achieve specific biotechnological goals or how 
the same system has evolved in order to more efficiently 
execute a given biological function. Different trade-offs 
are considered during the modeling process. On the one 
hand, one wants to use models that are as simple as 
possible to guarantee numerical tractability. Unfortu- 
nately simplifications may lead to models whose accu- 
racy is only ensured for a limited range of physiological 
conditions. On the other hand, models that are very 
detailed and accurate over a wide range of physiological 
conditions are typically more difficult to analyze and 
optimize. Needless to say, the type of modeling strategy 
and the model one chooses to implement have a large 
impact on the results of the analysis. The most widely 
used strategies in the context of optimization are: (1) 
Stoichiometric models, (2) Kinetic models, and (3) 
Approximated models. 

The three strategies have as a starting point a set of 
ordinary differential equations, in which the dependent 
variables or nodes are the chemical species whose dyna- 
mical behavior one is interested in studying. For a sys- 
tem with n dependent variables, p processes and m 
independent variables, the node equations are written as 
follows: 

P 

Xi = /JLi r V r i = 1, .., Tl (10) 

r=l 

far stands for the stoichiometry of each metabolite X t 
in each reaction r with respect to metabolite i and can 
be derived from the reaction scheme. 

At this stage, the various strategies begin to differ in the 
way that they implement and analyze the equations. Typi- 
cally, Flux balance analysis (FBA) and related techniques 
consider only the steady state behavior of the system, and 
treat v r as a variable whose value can be changed in order 
to optimize specific steady state constraints. To accom- 
plish this, FBA-like methods attempt to find solutions for 
the following system of linear equations: 

P 

0 = ^ir y r i = 1/ n (11) 
r=l 
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This system of equations is solved under different 
assumptions. A typical problem is that of understanding 
the effect of knocking out different genes from the sys- 
tem. This analysis can be performed by setting v r - 0 
for the process(es) that depend on the product of the 
genes that are knocked out. Once these constraints are 
set, linear optimization techniques can be used to iden- 
tify the region of the variable space that satisfies the 
steady state and optimizes at the same time a set of spe- 
cific measurable aspects of the systems [42-44]. It must 
be noted that FBA analysis of Eqn. (11) does not 
account for the regulatory effects that can result from 
gene knockout and it cannot be used to predict changes 
in metabolic concentrations and temporal responses. 
Thus, optimization constraints are limited to steady- 
state fluxes [15]. 

To overcome these limitations, we must use more 
complex kinetic models where the effect of changing the 
values of the variables on the fluxes is taken into 
account. This requires defining a functional form for 
each v r in Eqn. (10). Often, this functional form is 
drawn from a number of classical enzyme kinetic rate- 
laws. As a result, we use an approximate expression for 
the kinetic behavior of each elementary process whose 
form depends on the underlying mechanism of the pro- 
cess. The reason for this is that the classical rate laws 
are rational functions of the variables and they are built 
upon different types of simplifying assumptions on the 
detailed mechanism of the reactions. Such assumptions 
range from considering that the elementary chemical 
steps of the catalytic process occur at very different 
timescales to assuming that the concentration of the 
catalyst and of the reactants differ in orders of magni- 
tude. Thus, rate laws such as the popular Michaelis- 
Menten are approximations to the actual mechanism in 
specific conditions. However, more often than not, one 
does not have enough information to judge if such con- 
ditions meet those one is trying to model. Thus, using 
rational enzyme kinetics in models lacks a sound theore- 
tical ground. In fact, within the complex architecture of 
the intracellular milieu, many of the assumptions that 
justify these classical rate-laws may not hold [45-47]. 
Even in the best case scenario where a detailed kinetic 
model using classical enzyme kinetics can be derived 
and numerically identified, it may be hard to globally 
optimize that model using general purpose algorithms. 
As we will show here, available optimization techniques 
may fail to solve fairly trivial optimization tasks even in 
simple models. These numerical difficulties can be over- 
come by defining reformulated models based on canoni- 
cal representations that are easier to handle using 
customized global optimization algorithms devised for 
specific canonical functional forms. 



As an alternative, theoretically well supported canoni- 
cal representations can be derived using approximation 
theory. One type of such representations are power-law 
models. In a power-law model, each v r in Eqn. (10) is 
approximated as [19,21]: 



Vr{X\, ..,X n , ...X n+m ) 



YrY\4 r=1 >~>P 



(12) 



This approximation is derived at a given operating 
point (Xi 0 ,X 2o , ..,X( n+m ) o ) as a first-order Taylor series 
representation of the target function in log-log space. 
This approximation can generate models with different 
representations. The two that are most commonly used 
are the S-system representation and the GMA represen- 
tation. The S-system representation is obtained by 
lumping the various processes that contribute to the 
synthesis of a given metabolite into a global process of 
synthesis and those that contribute to the utilization 
of a given metabolite into a global degradation process 
P 



r=l 
P 



r=l 



V + - V~ 
v i y i 



i = 1, .., n 



P P 
r=l r=l 

= V+-Vr i=l,..,n 



(13) 



Then, the aggregated processes are represented by 
power-law functions: 



Xi = a x n X) 



ftflV i=l,..,n 



(14) 



Alternatively, the GMA form is obtained representing 
each individual v r as a power-law: 



r=l 

p ( n+m 



(15) 
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The parameters in these representations have a clear 
physical interpretation. Kinetic orders, the exponents in 
the power-laws, are local sensitivities of the fluxes, 
either individual (f r j for v r ) or aggregated (g t j for Vf and 
hij for V f _ ), with respect to Xj, Rate- constants (a b /3, and 
y r ) are parameters that are computed so that the flux in 
the model at steady state is equal to the operating flux 
at the operating point for the metabolites. Parameter 
estimation techniques have been developed so that 
power-law parameters can be calculated from experi- 
mental measurements [13]. It should also be noted that 
the use of estimation procedures (i.e., least-squares), 
alternate regression or similar procedures to estimate 
power-law parameters from dynamic curves lead to a 
power-law representation that is no longer local accord- 
ing to the classical definition [48-50]. Those models 
may, by definition, slightly improve their accuracy over 
strictly local models. 

To complement the power-law approach, the Satur- 
able and Cooperative (SC) formalism was introduced by 
Sorribas et al. [23] as an extension of the ideas that led 
to the power-law formalism. The SC representation of a 
given velocity is: 



v r (Xit -,X n , ..X n+m ) 



v r n *T j 

n+m / x 



(16) 



This representation can be obtained from a power-law 
model defined at a given operating point X 0 = (X 10 ,.., X 
(n + m)o) through the following relationships: 



1 - Prj 



r = 1, 



,n + m 



K ri 



1 - Prj 

Prj 



A i0 



r = 1, ..,p 
j = l,..,n + m 



(17) 



(18) 



Thus SC uses the same information as the power-law 
except for the new parameters p r j (saturation fractions), 
which are defined as: 



canonical GMA model, through the introduction of aux- 
iliary variables, as will be shown in the next section. 

5.2 Recasting non-linear models into power-law canonical 
models by increasing the number of variables 

Non-linear models can be exactly recast into GMA or S- 
system models through the use of auxiliary variables 
[36]. As a result, the final model is an exact representa- 
tion of the original model, written in a canonical form. 
In other words, the resulting GMA model is not an 
approximation to the original model: it is an exact 
replica of it. To avoid confusion, hereafter, we refer to a 
GMA model that exactly recasts another as an rGMA 
model. 

As a very simple introductory example, consider a lin- 
ear pathway with two internal metabolites X 1 and X 2 
and a source metabolite X 3 (Figure 3). In this pathway, 
X 2 is a competitive inhibitor of the synthesis of X x from 
the source metabolite. A generic model using Michaelis- 
Menten kinetic functions, assuming a competitive inhi- 
bition of the first reaction by X 2 , can be written as: 



Xi 



x 7 



ViX 3 



Ki(l+Ki/X 2 )+X3 

v 2 x 1 



K 2 +Xi 



(20) 



v 3 x 2 2 



K 2 + X\ K 3 + X2 



in which X 3 is an externally fixed variable. 
Recasting this model as a rGMA can be done as fol- 
lows. First, let us define three new variables: 

X 4 = Ki{l+Ki/X 2 ) +X 3 

X 5 =K 2 +X l (21) 
Xg = K 3 + X2 

We can now write the model in 20 as: 

Xi = V1X3X4 1 - V2X1X5 1 
X 2 = V2X1X5 1 - VsXjX' 1 

with initial conditions Xi(0) = Xi 0 and X 2 (0) = X 2o . 



(22) 



Prj = Vro/Vrj 



r = 1, ..,p 
j = 1, n + m 



(19) 



where v r0 = v r (X 10 ,.., X n0f ... X (w + m)0 ) and V r j is either 
the limit velocity (saturation) when Xj — > 00 if n r j > 0, or 
the limit velocity when Xj — > 0 if n r j < 0. 

Using SC models for global optimization can raise 
some numerical issues. These difficulties can be avoided 
to a large extent by recasting SC models into a 
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To complete the recasting we must now provide the 
equations that follow the change in the new variables 
over time. These are given by the following equations: 

• KiKiX 2 



= V^KiX- 1 - V 2 K 1 K l X 1 X~ 1 
X~ 2 X 5 = Xi (23) 
= V1X3X4 1 - V 2 XiX 5 _1 
Xg = 2X2X2 

= 2V2X1X2X5 1 - 2V 3 X 3 2 X~ 1 

with initial conditions X 4 (0) = Ki(l + Ki/X 2o ) + X 3o , 
X 5 (0) = K 2 +X lo , and X 6 (0) = Kf + Xf Q . 

The resulting rGMA model (22-23) is an exact repre- 
sentation of model in (20). Hence, for a set of appropri- 
ate initial conditions, the simulation of the dynamic 
response using either the model recast as a rGMA or 
the original model will produce the same trajectory. In 
principle, any non-linear model can be recast into a 
rGMA following a similar procedure [36]. This can be 
extremely useful, because it allows for the application of 
tailored global optimization procedures originally 
devised for GMA models [28-30,32,51,52] to generic 
non-linear models. 
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